The Data Flow

Let’s learn about the data flow of clean architecture.

In this chapter, we’ll introduce a very simple system designed with clean architecture. The purpose of this introductory chapter is to familiarize ourselves with the main concepts of clean architecture, such as separation of concerns and inversion of control, which are both very important in system design.

As we describe how data flows in the system, we’ll purposefully omit details so that we can focus on the overall idea without worrying too much about its implementation. We’ll look more closely at our example again, including why we make specific choices, in the chapters that follow. For now, we’ll focus on the big picture.

For this course, we’ve chosen to design a simple web application that provides a system for renting rooms. Let’s call our application Rent-o-Matic. The URL http://www.rentomatic.com is where our application is located.

Let’s now imagine that a user wants to see the available rooms on Rent-o-Matic. They open the application in their browser and, using the menus on the website, find the page that lists all the rooms our company rents. We’ll say that the URL for this page is /rooms?status=available. When the user’s browser accesses this URL, an HTTP request is sent to our system. The component that waits for HTTP connections is called the web framework.

Web framework

The purpose of the web framework is to understand the HTTP request and to retrieve the data that we need to provide a response. In our example, the request has two crucial parts: the endpoint itself (/rooms) and a single query string parameter (status=available). Endpoints act as commands to our system. So, when a user accesses one of them, they send a signal to our system that a specific service has been requested—in our case, to provide a list of all the rooms that are available for rent.

The web framework interprets the HTTP request
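To make the two parts of the request concrete, here’s a minimal sketch that uses only Python’s standard library to split the URL from our example. A real web framework does much more than this; the function name interpret_request is a hypothetical helper introduced only for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def interpret_request(url):
    """Split a request URL into its endpoint and its query string parameters."""
    parts = urlsplit(url)
    # parse_qs maps each key to a list of values; we keep the first value only.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.path, params


endpoint, params = interpret_request(
    "http://www.rentomatic.com/rooms?status=available"
)
print(endpoint)  # the command sent to our system
print(params)    # the parameters that refine the command
```

The endpoint tells the system which service is requested, while the parameters carry the values that the service needs.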

Use case

The web framework operates according to the HTTP protocol. When the web framework decodes the request, it passes the relevant information to another component that processes it. This other component is called the use case, and it’s one of the most important components of the entire clean architecture system, since it implements business logic.

Business logic

Business logic is a specific algorithm or process that we want to implement to transform data in order to provide a service. When we create a system, we generally have an idea about how it’ll help users, so we formulate our business logic accordingly. It’s important to have this idea because it helps us decide how data should be processed, extracted, and presented. The following are examples of business logic:

  • A search engine finds all web pages related to a query.
  • A social networking website displays the posts of people we follow and sorts them according to a specific algorithm.
  • A travel company finds the best way for us to travel from point A to point B.

The use case implements a very specific part of business logic. In our example, we have a use case that searches for rooms with a given status, whose value is passed in the status parameter. This means that the use case must extract all the rooms managed by our company and filter them to show only the available ones.
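As a rough sketch, the filtering part of this use case could look like the following. The room fields (code and status) and the sample data are assumptions made for illustration, and the rooms are passed in directly because we haven’t yet discussed where they come from.

```python
def list_rooms(rooms, status):
    """Return only the rooms whose status matches the requested one."""
    return [room for room in rooms if room["status"] == status]


# Hypothetical sample data; the field names are assumptions.
all_rooms = [
    {"code": "A1", "status": "available"},
    {"code": "B2", "status": "rented"},
    {"code": "C3", "status": "available"},
]

available = list_rooms(all_rooms, "available")
print(available)
```

Note that nothing in this function knows about HTTP. It receives plain Python values and returns plain Python values.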

Some people may ask why we don’t employ the web framework to do this. The answer is that one of the main traits of a good system architecture is that it separates concerns. That is, it keeps the different responsibilities and domains separated. The web framework is there to process the HTTP protocol. It’s maintained by programmers concerned with that specific part of the system. It becomes complicated if we add business logic to the web framework because it mixes two very different fields.

Separation of concerns is an important concept to remember, because different parts of a system manage different parts of the process. When two separate parts of a system work on the same data or on the same part of a process, they become coupled. Although sometimes coupling is unavoidable, the higher the coupling is between two components, the harder it is to change one without affecting the other. Because of this, it’s best to avoid coupling as much as possible.

As we’ll see, if we separate layers, it allows us to maintain the system with less effort. This will make individual parts of it more testable and easily replaceable.

Let’s return to our example. The use case in our system needs to fetch all the rooms that are available. This is an example of business logic. It’s very straightforward in this case, since it consists of simple filtering based on the value of an attribute. However, this might not always be the case. An example of more advanced business logic is placing orders based on a system of recommendations, since it may require the use case to connect with more components than just the data source.

Storage system

The information that the use case wants to process is stored somewhere. We can call this component the storage system. It can take the form of a database, such as a relational database, but that’s only one of the possible data sources. It may also be a file, a network endpoint, or a remote sensor. A source is anything that can provide data that the use case can access.


When designing a system, it’s important to think in terms of abstractions. Together with components, abstractions serve as the building blocks of a system. A component plays a role in a system, regardless of how that component is implemented. The higher the level of abstraction of a component, the less detailed it is. High-level abstractions don’t address practical problems, which is why abstract designs have to be implemented using specific solutions or technologies.

For the sake of simplicity, let’s use a relational database like PostgreSQL in our example, since it’s likely to be a familiar database to many of us. However, it’s important to keep in mind that it’s not the only database we can use as a solution to our problem.

The storage system

How does the use case connect with the storage system? If we hard code calls to a specific system (for example, a SQL database) into the use case, the two components become strongly coupled. As mentioned above, this is something that we try to avoid in system design. Coupled components are tightly connected, so a change to one forces a change to the other. It’s also more difficult to test the components, since one can’t run without the other. When one of the components is a complex system like a database, this limitation can severely slow down development.

For example, let’s assume that our use case calls a specific Python library, psycopg, to access PostgreSQL directly. This couples the use case with that specific source. Basically, a change of database results in a change to the use case’s code. This is not ideal, since the use case contains our business logic, and business logic doesn’t change when we move from one database system to another. Parts of the system that don’t contain business logic need to be treated as implementation details.
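To see the problem concretely, here’s a sketch of a coupled use case. So that it runs without a database server, it uses Python’s built-in sqlite3 module as a stand-in for psycopg and PostgreSQL. The table and column names are hypothetical.

```python
import sqlite3


def list_available_rooms_coupled(connection):
    """A coupled use case: the business rule and the SQL details are mixed."""
    cursor = connection.execute(
        "SELECT code, status FROM rooms WHERE status = ?", ("available",)
    )
    return [{"code": code, "status": status} for code, status in cursor.fetchall()]


# An in-memory database, created only so that the sketch can run.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE rooms (code TEXT, status TEXT)")
connection.executemany(
    "INSERT INTO rooms VALUES (?, ?)",
    [("A1", "available"), ("B2", "rented")],
)

coupled_result = list_available_rooms_coupled(connection)
print(coupled_result)
```

Moving this function to another storage system means rewriting it. Even the placeholder syntax differs between libraries (sqlite3 uses ?, while psycopg uses %s), and the function can’t be tested without a database.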

Implementation details are specific solutions or technologies that are not central to the design as a whole. The word “detail” doesn’t refer to the subject’s complexity, which may be greater than that of the parts that are more central to the design.

A relational database is far richer and more complex than an HTTP endpoint, which, in turn, is more complex than an ordered list of objects. The application’s core is the use case, not the way we store data or the way we provide access to it. Usually, implementation details are connected with performance or usability, while the core parts purely implement our business logic.

How can we avoid strong coupling? A simple solution is called inversion of control. In the next section, we’ll briefly outline the use of inversion of control. When we implement inversion of control in our example later on in this course, we’ll look at it in more detail.

Inversion of control

Inversion of control is a technique that’s used to avoid strong coupling between components of a system. It involves wrapping the components so that they expose a certain interface. A component expecting that interface can then connect to the wrapped components via the interface without knowing the details of the specific implementation. In this way, the component can be coupled to the interface instead of the specific implementation.

Inversion of control happens in two phases. First, the called object (that is, the database in our example) is wrapped with a standard interface. This is a set of functionalities shared by every implementation of the target. Each wrapper translates these functionalities into calls in the specific language of the wrapped implementation.

A real-world example of this is electrical plugs. Electric appliances are not designed to be used with one specific electrical plug but with any plug that has the right specifications, which are determined by the part of the world where the appliances are expected to be used. For instance, when we buy a TV in the UK, we expect it to come with a UK plug (BS 1363). If it doesn’t, we can use an adapter to plug it into the electrical outlets that are most commonly found in the UK. The adapter is inversion of control in action: it connects the use case (the TV) to a database (the electrical outlet) that hasn’t been designed to match a common interface.

In our example for this course, we need the use case to extract all rooms with the status available, so the database wrapper needs to provide a single entry point. Let’s assign the name list_rooms_with_status to this single entry point.

The storage interface
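The first phase can be sketched with two wrappers that expose the same entry point, list_rooms_with_status, over two different sources. The class names, the CSV layout, and the room fields are assumptions made for illustration. They aren’t the implementation used later in the course.

```python
import csv
import io


class MemoryRoomStorage:
    """Wrapper over a plain Python list (hypothetical storage)."""

    def __init__(self, rooms):
        self._rooms = rooms

    def list_rooms_with_status(self, status):
        return [room for room in self._rooms if room["status"] == status]


class CsvRoomStorage:
    """Wrapper over CSV text with 'code' and 'status' columns (hypothetical storage)."""

    def __init__(self, csv_text):
        self._csv_text = csv_text

    def list_rooms_with_status(self, status):
        # Translate the shared entry point into the "language" of this source.
        reader = csv.DictReader(io.StringIO(self._csv_text))
        return [dict(row) for row in reader if row["status"] == status]


memory = MemoryRoomStorage(
    [{"code": "A1", "status": "available"}, {"code": "B2", "status": "rented"}]
)
from_csv = CsvRoomStorage("code,status\nA1,available\nB2,rented\n")

print(memory.list_rooms_with_status("available"))
print(from_csv.list_rooms_with_status("available"))
```

Both wrappers answer the same call in the same format, so a caller doesn’t need to know which one it’s talking to.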

In the second phase of inversion of control, the caller (the use case) is modified to avoid hard coding the call to the specific implementation, so they don’t become coupled. The use case accepts an incoming object as a parameter of its constructor and receives a concrete instance of the adapter at the time of creation. The specific technique used to implement this depends greatly on the programming language we use. Python doesn’t have an explicit interface syntax, so we’ll assume that the object we pass implements the required methods.

Inversion of control on the storage interface
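The second phase can be sketched as follows. The use case receives the storage object in its constructor and only ever calls list_rooms_with_status on it. The names RoomStorage, RoomListUseCase, and FakeStorage are hypothetical. The typing.Protocol class is an optional way to document the expected interface for type checkers, and it isn’t enforced at runtime.

```python
from typing import Protocol


class RoomStorage(Protocol):
    """The interface the use case expects from any storage wrapper."""

    def list_rooms_with_status(self, status: str) -> list: ...


class RoomListUseCase:
    def __init__(self, storage: RoomStorage):
        # The concrete wrapper is received at creation time, not hard coded.
        self._storage = storage

    def execute(self, status: str) -> list:
        return self._storage.list_rooms_with_status(status)


class FakeStorage:
    """Stand-in wrapper with fixed data, useful for testing the use case."""

    def list_rooms_with_status(self, status):
        rooms = [
            {"code": "A1", "status": "available"},
            {"code": "B2", "status": "rented"},
        ]
        return [room for room in rooms if room["status"] == status]


use_case = RoomListUseCase(FakeStorage())
result = use_case.execute("available")
print(result)
```

Because the use case depends only on the interface, we can test it with FakeStorage and later pass in a wrapper around a real database without touching the use case code.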

The use case is now connected with the adapter and knows the interface. It can now call the entry point list_rooms_with_status and pass the status available to look for available rooms on our website. The adapter knows the details of the storage system, so it converts the method call and its parameter into a specific call (or set of calls) that extracts the requested data. It then converts the result into the format expected by the use case. For example, it may return a Python list of dictionaries that represent rooms.

The business logic extracts data from the storage
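Here’s a sketch of what such an adapter might do internally. It uses Python’s built-in sqlite3 module instead of PostgreSQL so that it can run without a database server. The class name, table, and columns are assumptions made for illustration.

```python
import sqlite3


class SqliteRoomStorage:
    """Adapter that translates the entry point into SQL and rows into dictionaries."""

    def __init__(self, connection):
        self._connection = connection

    def list_rooms_with_status(self, status):
        # Convert the method call and its parameter into a storage-specific call.
        cursor = self._connection.execute(
            "SELECT code, status, price FROM rooms WHERE status = ?", (status,)
        )
        # Convert the raw rows into the format expected by the use case.
        return [
            {"code": code, "status": row_status, "price": price}
            for code, row_status, price in cursor
        ]


# An in-memory database, created only so that the sketch can run.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE rooms (code TEXT, status TEXT, price INTEGER)")
connection.executemany(
    "INSERT INTO rooms VALUES (?, ?, ?)",
    [("A1", "available", 40), ("B2", "rented", 55)],
)

storage = SqliteRoomStorage(connection)
rooms = storage.list_rooms_with_status("available")
print(rooms)
```

All the SQL lives inside the adapter, so replacing the database only requires writing a new adapter with the same entry point.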

At this point, the use case has to apply the rest of our business logic, if needed, and return the result to the web framework.

Business logic returns processed data to the web framework

The web framework converts data received from the use case into an HTTP response. In this case, we have an endpoint that the website user should reach explicitly. The web framework will return an HTML page in the body of the response.

The web framework returns the data in an HTTP response

If this were instead an internal endpoint called by some asynchronous JavaScript code on the frontend, the response’s body would likely be a JSON structure.
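The difference between the two kinds of response bodies can be sketched as follows. The function names and the room fields are hypothetical, and a real web framework would normally use templates and its own response objects rather than hand-built strings.

```python
import json
from html import escape


def render_html(rooms):
    """Build a minimal HTML body for a page that a user visits in the browser."""
    items = "".join(f"<li>{escape(room['code'])}</li>" for room in rooms)
    return f"<ul>{items}</ul>"


def render_json(rooms):
    """Build a JSON body for an endpoint called by frontend JavaScript code."""
    return json.dumps(rooms)


rooms = [{"code": "A1", "status": "available"}]
html_body = render_html(rooms)
json_body = render_json(rooms)
print(html_body)
print(json_body)
```

In both cases, the use case returns the same data. Only the last conversion step, which belongs to the web framework, changes.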
